110 research outputs found

    S-MART, A Software Toolbox to Aid RNA-seq Data Analysis

    High-throughput sequencing is now routinely performed in many experiments, but analysing the millions of sequences generated is often beyond the expertise of wet labs that have no personnel specializing in bioinformatics. Whereas several tools are now available to map high-throughput sequencing data on a genome, few of these can extract biological knowledge from the mapped reads. We have developed a toolbox called S-MART, which handles mapped RNA-Seq data. S-MART is an intuitive and lightweight tool that performs many of the tasks usually required for the analysis of mapped RNA-Seq reads. S-MART does not require any computer science background and, thanks to its graphical interface, can be used by the whole biology community. S-MART can run on any personal computer and, for most queries, yields results within an hour even for gigabytes of data. S-MART can perform the entire analysis of the mapped reads without any need for other ad hoc scripts. With this tool, biologists can easily perform most analyses of their RNA-Seq data on their own computer, from the mapped data to the discovery of important loci.

    BlastR—fast and accurate database searches for non-coding RNAs

    We present and validate BlastR, a method for efficiently and accurately searching non-coding RNAs. Our approach relies on the comparison of di-nucleotides using BlosumR, a new log-odd substitution matrix. In order to use BlosumR for comparison, we recode RNA sequences into protein-like sequences. We then show that BlosumR can be used along with the BlastP algorithm to search non-coding RNA sequences. Using Rfam as a gold standard, we benchmark this approach and show BlastR to be more sensitive than BlastN. We also show that BlastR is both faster and more sensitive than BlastP used with a single-nucleotide log-odd substitution matrix. BlastR, when used in combination with WU-BlastP, is about 5% more accurate than WU-BlastN and about 50 times slower. The approach shown here is equally effective when combined with the NCBI-Blast package. The software is open-source freeware available from www.tcoffee.org/blastr.html.
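    The recoding step described in this abstract can be illustrated with a minimal sketch. It assumes overlapping di-nucleotides and an arbitrary assignment of 16 amino-acid letters; the abstract does not give the actual BlastR mapping, so the names AMINO_LETTERS, DINUC_TO_LETTER and recode_rna are hypothetical and chosen for illustration only.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# ASSUMPTION: 16 arbitrary amino-acid letters, one per di-nucleotide.
# This is NOT the real BlastR encoding, which the abstract does not specify.
AMINO_LETTERS = "ACDEFGHIKLMNPQRS"

# Map each of the 16 di-nucleotides (AA, AC, ..., UU) to one letter.
DINUC_TO_LETTER = {
    a + b: letter
    for (a, b), letter in zip(product(NUCLEOTIDES, repeat=2), AMINO_LETTERS)
}


def recode_rna(seq: str) -> str:
    """Recode an RNA sequence as a protein-like string.

    ASSUMPTION: one letter is emitted per overlapping di-nucleotide,
    so a sequence of length n yields n - 1 letters.
    Raises KeyError on symbols other than A, C, G, U (or T).
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    return "".join(DINUC_TO_LETTER[seq[i:i + 2]] for i in range(len(seq) - 1))


if __name__ == "__main__":
    print(recode_rna("GAUC"))
```

    A string recoded this way uses a protein-like alphabet, which is what would let a protein search tool compare it with a di-nucleotide substitution matrix, as the abstract describes for BlosumR and BlastP.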

    Characterization of 3D genomic interactions in fetal pig muscle

    Genome sequence alone is not sufficient to explain the overall coordination of nuclear activity in a particular tissue. Nuclear organisation and long-range intra- and inter-chromosomal genomic interactions play an important role in the regulation of gene expression and the activation of tissue-specific gene networks. Here we present an overview of the pig genome architecture in muscle at two late developmental stages. The muscle maturation process occurs between the 90th day and the end of gestation (114 days), a key period for survival at birth. To characterise this period, we profiled chromatin interactions genome-wide with in situ Hi-C (High Throughput Chromosome Conformation Capture) in muscle samples collected at 90 and 110 days of gestation, two time points between which a drastic change in gene expression has been reported. About 200 million read pairs per library were generated (3 replicates per condition). This allowed: (a) the design of an experimental Hi-C protocol optimized for frozen fetal tissues, (b) the first Hi-C contact heatmaps in fetal porcine muscle cells, and (c) the profiling of Topologically Associated Domains (TADs), defined as genomic domains with high levels of chromatin interactions. Using the new assembly version Sus scrofa v11, we could map 82% of the Hi-C reads on the reference genome. After filtering, 49% of valid read pairs were used to infer the genomic interactions in both developmental stages. In addition, ChIP-seq experiments were performed to map the binding of the structural protein CTCF, known to regulate genome structure by promoting interactions between genes and distal enhancers. The Hi-C and ChIP-seq data were analysed in combination with the results of a previous transcriptome analysis, focusing on the hundreds of genes that were reported as differentially expressed during muscle maturation. We will report the general differences observed between the two developmental stages in terms of transcription and structure.
